feat(derivation): validator cleanup + Path A/B verification mode + derivation-driven tag management#951
feat(derivation): validator cleanup + Path A/B verification mode + derivation-driven tag management#951curryxbo wants to merge 4 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (10)
📒 Files selected for processing (42)
💤 Files with no reviewable changes (8)
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (28)
📝 WalkthroughWalkthroughExports batch encoding helpers; implements SPEC-005 local verification, L1 reorg detection, tag-advancer for safe/finalized heads, finalized ticking, derivation refactor (remove validator), DB and CLI changes, tests, and ops/docker updates. ChangesSPEC-005 Local Verification & Infrastructure Cleanup
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Introduces Path B verification mode (SPEC-005 §4) as a startup-time
mutually-exclusive alternative to Path A. Selected via
`--derivation.verify-mode` (default "pathA", preserves current
behaviour). The two modes do not interact: under no circumstance does
the node fall back from one to the other; switching requires changing
the config and restarting.
Path B mode:
- Skips beacon-side blob fetch entirely.
- Reads L1 commitBatch tx calldata + tx.BlobHashes() only.
- Reads local L2 blocks for the batch range, replays the sequencer's
encoding (parsingTxs + buildBlockContext + BatchData) to rebuild
the blob bytes.
- Compresses + builds a BlobTxSidecar; compares its versioned hashes
against the values from L1 calldata.
- Reuses the shared verifyBatchRoots for state/withdrawal root
verification (independent of blob).
Touches:
- common/batch: export ParsingTxs / BuildBlockContext (rename from
package-private). No semantic change for tx-submitter.
- node/derivation/batch_info.go: add BatchInfo.blobHashes field;
add ParseBatchMetadataOnly (calldata-only parse, no blob).
- node/derivation/config.go + flags: add VerifyMode + flag with
fail-fast validation on unknown values.
- node/derivation/derivation.go: dispatch in main loop based on
verifyMode; populate blobHashes in Path A's fetch helper too so
BatchInfo is consistent across modes.
- node/derivation/verify_path_b.go: fetchBatchInfoPathB +
verifyBatchContentPathB + fetchLocalLastHeader.
- node/derivation/metrics.go: path_b_triggered_total /
path_b_failed_total counters.
Verified clean: `go build ./node/... ./common/...`, `go vet`,
test compile.
Refs: morph-l2/morph-specs SPEC-005 §4 / §5 / §6
Resolves the WIP item flagged in PR #951.
Brings in 6 commits from base feat/multi_batch since the previous merge (09a2c72) so PR #951 stops reporting CONFLICTING: - 4be9cda update challenge handler dep - c6f7cc2 add v0/v1 blob check - 7ec11cf Merge branch 'feat/multi_batch' of github.com:morph-l2/morph - fa9f4e8 fix submitter replay batch with config max blob count - 1fe0d74 add rustc version desc - fbd1bee chore: align go-ethereum submodule with origin/main Single content conflict in common/batch/batch_cache.go: base added a new sealEffectiveBlobCount function in the same hunk where SPEC-005 exported BuildBlockContext (PR #935-derived rename). Resolved by keeping both: sealEffectiveBlobCount stays where base placed it; the exported BuildBlockContext (with SPEC-005 docstring) follows. base's lowercase "buildBlockContext" comment was dropped (now stale -- we renamed it). All other touched files auto-merged: contracts/Rollup.sol, BlobVersionedHashLegacy.t.sol (added by base), prover/Cargo.{toml,lock}, prover/{,bin/challenge/}rust-toolchain (deleted by base), tx-submitter/mock/rollup.go, common/batch/batch_restart_test.go. go build ./common/batch/ ./common/blob/ ./node/... ./tx-submitter/... ./oracle/... -- clean. go test ./node/derivation/... -- 22 tests PASS (incl. 9 added in b9ad14e). common/batch tests hang on a pre-existing txSenderCacher goroutine leak unrelated to this merge (reproducible on origin/feat/multi_batch in isolation); not introduced here. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
node/derivation/metrics.go (1)
20-42:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winRun
gofmton this file to unblock CI.The current pipeline is failing due to formatting in this file; apply
gofmtand re-run checks.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@node/derivation/metrics.go` around lines 20 - 42, The file's formatting is out of sync with gofmt; run gofmt on the Metrics struct file (e.g., the Metrics type and its fields like L1SyncHeight, PathBTriggered, SafeAdvanceTotal, etc.) to normalize spacing and alignment, then re-add the formatted file to your commit; you can run `gofmt -w node/derivation/metrics.go` (or `gofmt -w .` at repo root) and re-run CI.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@node/cmd/node/main.go`:
- Around line 122-129: The Syncer has unsynchronized concurrent reads/writes to
the uint64 field latestSynced (accessed by fetchL1Messages(), the background
goroutine started in Start(), and readers via LatestSynced()), causing a data
race; fix by protecting all accesses to latestSynced: either add a sync.Mutex
(e.g., s.mu) and wrap every read/write in fetchL1Messages() and LatestSynced()
with s.mu.Lock()/Unlock(), or convert latestSynced to a uint64 and use
sync/atomic functions (atomic.LoadUint64/atomic.StoreUint64) in
fetchL1Messages(), Start() and LatestSynced() so all reads/writes are atomic and
race-free. Ensure you update any logging that reads latestSynced to use the
protected/atomic read.
---
Outside diff comments:
In `@node/derivation/metrics.go`:
- Around line 20-42: The file's formatting is out of sync with gofmt; run gofmt
on the Metrics struct file (e.g., the Metrics type and its fields like
L1SyncHeight, PathBTriggered, SafeAdvanceTotal, etc.) to normalize spacing and
alignment, then re-add the formatted file to your commit; you can run `gofmt -w
node/derivation/metrics.go` (or `gofmt -w .` at repo root) and re-run CI.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4791e23f-6353-4395-8d21-d6788540fd53
📒 Files selected for processing (35)
Makefilecommon/batch/batch_cache.gonode/blocktag/config.gonode/blocktag/service.gonode/cmd/node/main.gonode/derivation/batch_info.gonode/derivation/config.gonode/derivation/config_test.gonode/derivation/derivation.gonode/derivation/finalizer.gonode/derivation/metrics.gonode/derivation/static_scan_test.gonode/derivation/tag_advance.gonode/derivation/tag_advance_test.gonode/derivation/verify.gonode/derivation/verify_path_b.gonode/derivation/verify_path_b_test.gonode/flags/flags.gonode/ops-morph/docker-compose-validator.ymlnode/ops-morph/docker-compose.ymlnode/ops-morph/testnet/docker-compose.ymlnode/ops-morph/testnet/static-nodes.jsonnode/validator/config.gonode/validator/validator.gonode/validator/validator_test.goops/devnet-morph/devnet/__init__.pyops/docker-sequencer-test/Dockerfile.l2-geth-testops/docker-sequencer-test/Dockerfile.l2-node-testops/docker-sequencer-test/docker-compose.override.ymlops/docker-sequencer-test/run-test.shops/docker-sequencer-test/scripts/tx-generator.shops/docker/docker-compose-4nodes.ymlops/docker/docker-compose-reth.ymlops/docker/static-nodes.jsonoracle/docker-compose.yml
💤 Files with no reviewable changes (6)
- node/validator/config.go
- node/validator/validator_test.go
- node/validator/validator.go
- node/blocktag/config.go
- node/blocktag/service.go
- node/ops-morph/docker-compose-validator.yml
| // Reuse the shared syncer instance -- DevSequencer mode is the only path | ||
| // that pulls a syncer out of NewExecutor, so we hand back the same one | ||
| // rather than letting NewExecutor open a second store + syncer. | ||
| newSyncerFunc := func() (*sync.Syncer, error) { return syncer, nil } | ||
| executor, err = node.NewExecutor(newSyncerFunc, nodeConfig, pubKey) | ||
| if err != nil { | ||
| return err | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check if sync.Syncer has internal locking or concurrency documentation
# Check for mutex/lock fields in the Syncer struct
rg -nP --type=go -A 20 'type Syncer struct' node/sync/
# Check for concurrent access documentation or warnings
rg -nP --type=go -C 5 'concurrent|thread.?safe|goroutine' node/sync/syncer.goRepository: morph-l2/morph
Length of output: 1102
🏁 Script executed:
# Check how syncer is used in executor initialization
rg -nP --type=go -B 5 -A 10 'NewExecutor' node/cmd/node/main.go
# Check executor's use of syncer
fd -e go -path "*/executor*" | head -20
# Check derivation client initialization and syncer usage
rg -nP --type=go -B 3 -A 10 'derivation|Derivation' node/cmd/node/main.go | head -50Repository: morph-l2/morph
Length of output: 2505
🏁 Script executed:
# Find where derivation is initialized with syncer
rg -nP --type=go 'derivation\.' node/cmd/node/main.go | head -30
# Check derivation initialization signature
fd -name derivation.go | xargs rg -nP 'func New.*Derivation' -A 5
# Check if derivation stores and uses syncer
fd -name "*.go" -path "*/derivation/*" -type f
# Check how executor and derivation are used (concurrent or sequential)
rg -nP --type=go 'go ' node/cmd/node/main.go | grep -E '(executor|derivation|dvNode)'Repository: morph-l2/morph
Length of output: 1357
🏁 Script executed:
# Check if dvNode and executor run concurrently
rg -nP --type=go 'go dvNode|go executor|dvNode\.Start|executor\.' node/cmd/node/main.go
# Check syncer methods to see if they have internal synchronization
rg -nP --type=go 'func.*Syncer.*\(' node/sync/syncer.go | head -20
# Check Syncer's actual method implementations for locks/synchronization
rg -nP --type=go 'sync\.Mutex|sync\.RWMutex|\bLock\(\)|\bUnlock\(' node/sync/
# Look at how derivation uses syncer
rg -nP --type=go 'syncer\.' node/derivation/derivation.go | head -20Repository: morph-l2/morph
Length of output: 678
🏁 Script executed:
# Check Syncer.Start() implementation to see if it uses goroutines
rg -nP --type=go -A 20 'func.*Start.*\{' node/sync/syncer.go | grep -A 20 'func (s \*Syncer) Start'
# Check fetchL1Messages for goroutines
rg -nP --type=go -A 15 'func.*fetchL1Messages' node/sync/syncer.go
# Check how latestSynced is modified
rg -nP --type=go 'latestSynced\s*=' node/sync/syncer.go
# Check executor's use of syncer
rg -nP --type=go -B 3 -A 10 'newSyncerFunc\|syncer\.' node/core/ | head -40Repository: morph-l2/morph
Length of output: 1322
🏁 Script executed:
# Check if executor actually uses syncer after initialization
rg -nP --type=go -B 5 -A 15 'type Executor' node/core/executor.go | head -50
# Search for syncer usage in executor
rg -nP --type=go 'syncer\.' node/core/executor.go | head -20
# Check what NewExecutor does with the syncer function
rg -nP --type=go -A 30 'func NewExecutor' node/core/executor.go | head -50Repository: morph-l2/morph
Length of output: 1704
🏁 Script executed:
# Verify when syncer is started
rg -nP --type=go 'syncer.*Start\|syncer.*New' node/cmd/node/main.go
# Check the full flow: when is syncer created vs when is dvNode started
sed -n '77,151p' node/cmd/node/main.go
# Check if derivation.Start() is async (goroutine)
rg -nP --type=go -B 2 -A 10 'func.*Derivation.*Start' node/derivation/derivation.go | head -30Repository: morph-l2/morph
Length of output: 3244
🏁 Script executed:
# Check if executor calls newSyncerFunc or uses syncer
rg -nP --type=go 'newSyncerFunc\|e\.syncer' node/core/executor.go
# Check the full Start method of Derivation to confirm syncer.Start() timing
rg -nP --type=go -A 30 'func.*Derivation.*Start' node/derivation/derivation.goRepository: morph-l2/morph
Length of output: 875
🏁 Script executed:
# Check for atomic operations on latestSynced
rg -nP --type=go 'atomic\.' node/sync/syncer.go
# Check if there are any channels or other sync mechanisms
rg -nP --type=go 'chan\|sync\.' node/sync/syncer.go | grep -v 'package\|import'
# Verify latestSynced access patterns
rg -nP --type=go 'latestSynced' node/sync/syncer.go
# Check derivation's use of syncer - does it call LatestSynced or ReadL1MessagesInRange?
rg -nP --type=go 's\.latestSynced\|LatestSynced\|ReadL1MessagesInRange' node/derivation/derivation.goRepository: morph-l2/morph
Length of output: 733
🏁 Script executed:
# Check if there are any other synchronization mechanisms (channels, etc)
cat node/sync/syncer.go | head -100
# Check how ReadL1MessagesInRange accesses latestSynced
rg -nP --type=go -B 5 -A 10 'ReadL1MessagesInRange' node/sync/syncer.goRepository: morph-l2/morph
Length of output: 3258
Syncer requires synchronization for concurrent access to latestSynced field.
The sync.Syncer struct shares the latestSynced field (uint64) between the background polling goroutine spawned by Start() and external readers (e.g., LatestSynced()). The fetchL1Messages() method reads and writes this field without any mutex or atomic protection, creating a data race when the syncer is concurrently accessed by the derivation client.
The latestSynced field is read at line 131 (loop initialization) and lines 136-137 (logging), and written at line 167, all within fetchL1Messages() which runs in a background goroutine. This unsynchronized concurrent access can cause memory corruption or unpredictable behavior.
Add sync.Mutex to protect latestSynced or use sync/atomic for the uint64 operations.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@node/cmd/node/main.go` around lines 122 - 129, The Syncer has unsynchronized
concurrent reads/writes to the uint64 field latestSynced (accessed by
fetchL1Messages(), the background goroutine started in Start(), and readers via
LatestSynced()), causing a data race; fix by protecting all accesses to
latestSynced: either add a sync.Mutex (e.g., s.mu) and wrap every read/write in
fetchL1Messages() and LatestSynced() with s.mu.Lock()/Unlock(), or convert
latestSynced to a uint64 and use sync/atomic functions
(atomic.LoadUint64/atomic.StoreUint64) in fetchL1Messages(), Start() and
LatestSynced() so all reads/writes are atomic and race-free. Ensure you update
any logging that reads latestSynced to use the protected/atomic read.
Implements SPEC-005 derivation verification: - verify_local.go: Path B local-rebuild blob verification (rebuild blockContext + L2 tx blob from local chain, compare against on-chain Rollup batch) - verify.go: extract verifyBatchRoots, gate stateException on real divergence verdicts (not transient errors) - finalizer.go + reorg.go: derivation-driven finalizer + L1 reorg detection (SPEC-005 §4.7.6), rewind-and-reset for canonicality - tag_advance.go + metrics.go: derivation-driven L2 tag management, structured failure diagnostics for Path B Common: export common/batch.BuildBlockContext for derivation reuse. go-ethereum: bump submodule to 045be0fdc7ca (NewL2BlockV2 self-heal). Ops: add second sentry node for derivation validation (4-nodes compose, node5 key, devnet setup). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaced by derivation-driven tag management (see prior commit): - node/validator/*: removed dedicated validator role - node/blocktag/*: removed standalone block-tag advancer service - node/cmd/node/main.go: drop validator/blocktag wiring - node/flags/flags.go: drop validator-specific CLI flags - ops-morph/docker-compose-validator.yml: drop validator compose file Tags are now advanced inline by the derivation loop, eliminating the extra service and the role-based branching in main. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- sync/syncer.go: make Syncer.Start idempotent via sync.Once so re-entry on retry no longer leaks goroutines or races on state - types/retryable_client.go (+test): treat ethereum.NotFound as a permanent failure rather than retrying forever - db/keys.go + db/store.go: derivation-related key helpers used by the new finalizer / tag-advance paths Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After PR #935 (multi_batch) was squash-merged into main, refresh deps across all 9 modules. Keep PR's morph-l2/go-ethereum version v1.10.14-0.20260508105911-56deb7072ae4 (which contains NewL2BlockV2 required by the Path B self-heal flow), while taking main's other indirect updates from multi_batch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
092f068 to
973ab81
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
ops/devnet-morph/devnet/setup_nodes.py (1)
61-67:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMissing
sentry-node-1in persistent_peers configuration.The testnet command now creates 6 nodes including
sentry-node-1, butpersistent_peers_valueonly includes 5 peers (up tosentry-node-0). This meanssentry-node-1won't be discovered via persistent peers.🔧 Proposed fix to add sentry-node-1 to persistent peers
persistent_peers_value = ( "93e27ea2306e158a8146d5f44caaab97496797d2@node-0:26656," "7f78b7d7a7e6bad4faf68d5731d437f4288d96d0@node-1:26656," "06c699be2f9aeb9f7ec79f508a95ff80576deb12@node-2:26656," "b1a131f40d5d3abefe0dd787513c936ef62ac2d6@node-3:26656," - "dae813274913aaf39e7cd3226a0aa8bce00644e1@sentry-node-0:26656" + "dae813274913aaf39e7cd3226a0aa8bce00644e1@sentry-node-0:26656," + "<node5_id>`@sentry-node-1`:26656" )Note: You'll need to determine the correct node ID for
sentry-node-1(checknode5/config/node_key.jsonafter generation).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ops/devnet-morph/devnet/setup_nodes.py` around lines 61 - 67, persistent_peers_value is missing the entry for sentry-node-1 so the testnet won't discover it; after generating node keys, read the node ID from node5/config/node_key.json and append an entry "NODE_ID@sentry-node-1:26656" to the persistent_peers_value string (match the existing comma-separated format), ensuring the new peer uses the actual node ID you found; update the persistent_peers_value definition so it includes sentry-node-1 alongside the existing sentry-node-0 entry.ops/docker/docker-compose-4nodes.yml (1)
484-496:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDuplicate environment variable
TX_SUBMITTER_SEAL_BATCH.
TX_SUBMITTER_SEAL_BATCH=trueis set twice intx-submitter-0(lines 484 and 495). Remove one of the duplicates.🔧 Proposed fix
- - TX_SUBMITTER_SEAL_BATCH=true - TX_SUBMITTER_METRICS_SERVER_ENABLE=false - TX_SUBMITTER_METRICS_HOSTNAME=0.0.0.0 - TX_SUBMITTER_METRICS_PORT=6060 - TX_SUBMITTER_TX_FEE_LIMIT=500000000000000000 `#0.5e` - TX_SUBMITTER_LOG_FILENAME=tx_submitter.log - TX_SUBMITTER_LOG_FILE_MAX_SIZE=100 `#MB` - TX_SUBMITTER_LOG_FILE_MAX_AGE=7 `#day` - TX_SUBMITTER_LOG_COMPRESS=true - TX_SUBMITTER_L1_STAKING_ADDRESS=${MORPH_L1STAKING:-0x5fc8d32690cc91d4c39d9d3abcbd16989f875707} - TX_SUBMITTER_L1_STAKING_DEPLOYED_BLOCKNUM=0 - TX_SUBMITTER_SEAL_BATCH=true🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ops/docker/docker-compose-4nodes.yml` around lines 484 - 496, The docker-compose env block for the tx-submitter service contains a duplicate variable TX_SUBMITTER_SEAL_BATCH (present in the tx-submitter-0 env list); remove one of the occurrences so TX_SUBMITTER_SEAL_BATCH=true is defined only once, keeping the single canonical entry and leaving all other TX_SUBMITTER_* variables unchanged.
🧹 Nitpick comments (3)
ops/devnet-morph/devnet/setup_nodes.py (1)
108-110: 💤 Low valueMisleading error message after refactoring.
The error message says "Missing source or destination" but at this point only the source directory is being checked (destination was already validated above).
✏️ Suggested fix
if not os.path.isdir(source_dir): - print(f"Error: Missing source or destination directory for {node}. Exiting.") + print(f"Error: Missing source directory for {node}. Exiting.") sys.exit(1)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ops/devnet-morph/devnet/setup_nodes.py` around lines 108 - 110, The error message in the check for source_dir is misleading: update the conditional handling around source_dir (the if not os.path.isdir(source_dir) block that prints the error for node) to report only the missing source directory (e.g., "Error: Missing source directory for {node}. Exiting.") or otherwise include accurate context that destination was already validated; adjust the process that calls print/fails so the message references source_dir and node explicitly.node/derivation/static_scan_test.go (1)
18-54: 💤 Low valueThe repo-root discovery assumes a fixed directory depth.
filepath.Dir(wd)assumes the test always runs from exactly.../morph/node/derivation. If someone runs tests from a different working directory or the package is nested differently,nodeRootwill point to the wrong location. Consider walking upward until you find a marker file (e.g.,go.modwithmodule morph-l2) or usinggo list -m -f '{{.Dir}}'to locate the module root reliably.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@node/derivation/static_scan_test.go` around lines 18 - 54, walkNodeRepoSourceFiles currently uses nodeRoot := filepath.Dir(wd) which assumes the test is run from .../morph/node/derivation; change it to reliably locate the repo/module root by climbing parent directories from wd until you find a marker (e.g., a go.mod containing the expected module name "morph-l2") or by invoking "go list -m -f '{{.Dir}}' <module>" and using that directory as nodeRoot; update references in the function (wd, nodeRoot) so the file walk uses the discovered root and fail the test with a clear error if the marker/command lookup fails.node/derivation/derivation.go (1)
282-299: 💤 Low valueTODO comment indicates incomplete work.
The
// TODO reorgcomment at line 284 suggests reorg handling logic is not yet implemented in this code path. The current behavior falls through to self-heal viafetchRollupDataByTxHash+deriveForce, but the comment implies additional reorg handling may be needed. Consider clarifying what's missing or removing the TODO if the current self-heal path is the intended behavior.Do you want me to open an issue to track this TODO?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@node/derivation/derivation.go` around lines 282 - 299, The TODO "reorg" comment indicates missing reorganization handling in the loop comparing rebuilt vs batchInfo.blobHashes; either implement explicit reorg handling or remove the TODO and document that the current self-heal path is the intended behavior. If implementing reorg handling, add code in the loop when rebuilt[i] != batchInfo.blobHashes[i] to invoke the reorg resolution routine (e.g., call an existing or new method like handleReorg or reconcileChain) before falling back to fetching full batch via fetchRollupDataByTxHash and re-deriving with deriveForce(batchInfoFull); otherwise delete the TODO and add a brief comment stating that mismatches are handled by fetchRollupDataByTxHash + deriveForce to avoid misleading future readers.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@node/db/store.go`:
- Around line 214-217: The loop that calls s.db.Has(key) currently treats any
Has error as "key missing" and breaks, which swallows DB errors; change the
logic in the reorg cleanup loop so that if s.db.Has(key) returns a non-nil err
you raise/panic with that error (preserving the codebase's panic-on-error
pattern for DB operations), and only break the loop when err == nil && has ==
false; keep references to s.db.Has and key when making this change so the check
is: if err != nil -> panic(err); else if !has -> break.
In `@node/derivation/derivation.go`:
- Around line 703-739: The anonymous closure that calls d.l2Client.NewL2BlockV2
(producing err) is not checked before proceeding to d.l2Client.HeaderByNumber;
capture and check the returned err from the closure (the variable named err
after the closure invocation) and if non-nil return it (or wrap it with context)
instead of continuing, so failures from NewL2BlockV2 are propagated and
HeaderByNumber is only called when execData was successfully written.
In `@node/derivation/verify_local.go`:
- Around line 203-216: The docs/comments are misleading: rebuildBlob (function
rebuildBlob / call site in verify_local.go) only validates blob count and
returns rebuilt blobs; per-index hash equality is performed later in
derivation.go (the loop comparing rebuilt[i] to batchInfo.blobHashes[i]). Update
the comment above rebuildBlob to state it only checks length and returns rebuilt
blobs (not per-hash equality), and update the ErrBatchVerifyDivergence /
versioned_hash_mismatch comment mapping in verify.go to indicate divergence is
detected at the derivation.go comparison site rather than inside rebuildBlob;
adjust any comment text referencing "rebuildBlob wraps
divergence/versioned_hash_mismatch" to reflect the current behavior.
In `@node/sync/syncer.go`:
- Around line 92-113: The code calls s.fetchL1Messages() twice: once
synchronously before the goroutine and again immediately inside the goroutine
loop; remove the redundant in-goroutine immediate call by changing the goroutine
loop to wait on the ticker (or select on ctx.Done and t.C) before calling
s.fetchL1Messages() so that the initial synchronous fetch (the one before the
goroutine) is the first fetch, leaving the goroutine to poll only on t.C; update
the comment near the ticker (t := time.NewTicker / "don't wait for ticker during
startup") to reflect this behavior and ensure the goroutine still closes s.stop
on s.ctx.Done.
---
Outside diff comments:
In `@ops/devnet-morph/devnet/setup_nodes.py`:
- Around line 61-67: persistent_peers_value is missing the entry for
sentry-node-1 so the testnet won't discover it; after generating node keys, read
the node ID from node5/config/node_key.json and append an entry
"NODE_ID@sentry-node-1:26656" to the persistent_peers_value string (match the
existing comma-separated format), ensuring the new peer uses the actual node ID
you found; update the persistent_peers_value definition so it includes
sentry-node-1 alongside the existing sentry-node-0 entry.
In `@ops/docker/docker-compose-4nodes.yml`:
- Around line 484-496: The docker-compose env block for the tx-submitter service
contains a duplicate variable TX_SUBMITTER_SEAL_BATCH (present in the
tx-submitter-0 env list); remove one of the occurrences so
TX_SUBMITTER_SEAL_BATCH=true is defined only once, keeping the single canonical
entry and leaving all other TX_SUBMITTER_* variables unchanged.
---
Nitpick comments:
In `@node/derivation/derivation.go`:
- Around line 282-299: The TODO "reorg" comment indicates missing reorganization
handling in the loop comparing rebuilt vs batchInfo.blobHashes; either implement
explicit reorg handling or remove the TODO and document that the current
self-heal path is the intended behavior. If implementing reorg handling, add
code in the loop when rebuilt[i] != batchInfo.blobHashes[i] to invoke the reorg
resolution routine (e.g., call an existing or new method like handleReorg or
reconcileChain) before falling back to fetching full batch via
fetchRollupDataByTxHash and re-deriving with deriveForce(batchInfoFull);
otherwise delete the TODO and add a brief comment stating that mismatches are
handled by fetchRollupDataByTxHash + deriveForce to avoid misleading future
readers.
In `@node/derivation/static_scan_test.go`:
- Around line 18-54: walkNodeRepoSourceFiles currently uses nodeRoot :=
filepath.Dir(wd) which assumes the test is run from .../morph/node/derivation;
change it to reliably locate the repo/module root by climbing parent directories
from wd until you find a marker (e.g., a go.mod containing the expected module
name "morph-l2") or by invoking "go list -m -f '{{.Dir}}' <module>" and using
that directory as nodeRoot; update references in the function (wd, nodeRoot) so
the file walk uses the discovered root and fail the test with a clear error if
the marker/command lookup fails.
In `@ops/devnet-morph/devnet/setup_nodes.py`:
- Around line 108-110: The error message in the check for source_dir is
misleading: update the conditional handling around source_dir (the if not
os.path.isdir(source_dir) block that prints the error for node) to report only
the missing source directory (e.g., "Error: Missing source directory for {node}.
Exiting.") or otherwise include accurate context that destination was already
validated; adjust the process that calls print/fails so the message references
source_dir and node explicitly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b7ce05ac-47bc-41d5-9574-120c48dfde23
⛔ Files ignored due to path filters (10)
bindings/go.sumis excluded by!**/*.sumcommon/go.sumis excluded by!**/*.sumcontracts/go.sumis excluded by!**/*.sumgo.work.sumis excluded by!**/*.sumnode/go.sumis excluded by!**/*.sumops/l2-genesis/go.sumis excluded by!**/*.sumops/tools/go.sumis excluded by!**/*.sumoracle/go.sumis excluded by!**/*.sumtoken-price-oracle/go.sumis excluded by!**/*.sumtx-submitter/go.sumis excluded by!**/*.sum
📒 Files selected for processing (33)
bindings/go.modcommon/batch/batch_cache.gocommon/go.modcontracts/go.modnode/db/keys.gonode/db/store.gonode/derivation/batch_decode.gonode/derivation/batch_info.gonode/derivation/config.gonode/derivation/config_test.gonode/derivation/database.gonode/derivation/derivation.gonode/derivation/finalizer.gonode/derivation/metrics.gonode/derivation/reorg.gonode/derivation/static_scan_test.gonode/derivation/tag_advance.gonode/derivation/tag_advance_test.gonode/derivation/verify.gonode/derivation/verify_local.gonode/flags/flags.gonode/go.modnode/sync/syncer.gonode/types/retryable_client.gonode/types/retryable_client_test.goops/devnet-morph/devnet/setup_nodes.pyops/docker/docker-compose-4nodes.ymlops/docker/node5/node_key.jsonops/l2-genesis/go.modops/tools/go.modoracle/go.modtoken-price-oracle/go.modtx-submitter/go.mod
💤 Files with no reviewable changes (1)
- node/derivation/batch_decode.go
✅ Files skipped from review due to trivial changes (8)
- ops/docker/node5/node_key.json
- tx-submitter/go.mod
- ops/tools/go.mod
- common/go.mod
- node/go.mod
- node/types/retryable_client_test.go
- oracle/go.mod
- node/derivation/batch_info.go
🚧 Files skipped from review as they are similar to previous changes (3)
- common/batch/batch_cache.go
- node/derivation/tag_advance_test.go
- node/derivation/tag_advance.go
| has, err := s.db.Has(key) | ||
| if err != nil || !has { | ||
| break | ||
| } |
There was a problem hiding this comment.
Do not swallow Has errors during reorg cleanup.
Line 215 currently treats DB errors the same as “key missing” and exits the loop. That can silently truncate deletes and leave stale derivation L1 block records after rewind.
🔧 Suggested fix
func (s *Store) DeleteDerivationL1BlocksFrom(height uint64) {
batch := s.db.NewBatch()
for h := height; ; h++ {
key := DerivationL1BlockKey(h)
has, err := s.db.Has(key)
- if err != nil || !has {
+ if err != nil {
+ panic(fmt.Sprintf("failed to check DerivationL1Block existence at %d, err: %v", h, err))
+ }
+ if !has {
break
}
if err := batch.Delete(key); err != nil {
panic(fmt.Sprintf("failed to delete DerivationL1Block at %d, err: %v", h, err))
}
}Based on learnings: DB write methods in this codebase intentionally follow a panic-on-error pattern and new methods should preserve that behavior.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| has, err := s.db.Has(key) | |
| if err != nil || !has { | |
| break | |
| } | |
| func (s *Store) DeleteDerivationL1BlocksFrom(height uint64) { | |
| batch := s.db.NewBatch() | |
| for h := height; ; h++ { | |
| key := DerivationL1BlockKey(h) | |
| has, err := s.db.Has(key) | |
| if err != nil { | |
| panic(fmt.Sprintf("failed to check DerivationL1Block existence at %d, err: %v", h, err)) | |
| } | |
| if !has { | |
| break | |
| } | |
| if err := batch.Delete(key); err != nil { | |
| panic(fmt.Sprintf("failed to delete DerivationL1Block at %d, err: %v", h, err)) | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@node/db/store.go` around lines 214 - 217, The loop that calls s.db.Has(key)
currently treats any Has error as "key missing" and breaks, which swallows DB
errors; change the logic in the reorg cleanup loop so that if s.db.Has(key)
returns a non-nil err you raise/panic with that error (preserving the codebase's
panic-on-error pattern for DB operations), and only break the loop when err ==
nil && has == false; keep references to s.db.Has and key when making this change
so the check is: if err != nil -> panic(err); else if !has -> break.
| // block node startup during initial sync and print some helpful logs | ||
| s.logger.Info("initial sync start", "msg", "Running initial sync of L1 messages before starting sequencer, this might take a while...") | ||
| s.fetchL1Messages() | ||
| s.logger.Info("initial sync completed", "latestSyncedBlock", s.latestSynced) | ||
|
|
||
| go func() { | ||
| t := time.NewTicker(s.pollInterval) | ||
| defer t.Stop() | ||
|
|
||
| for { | ||
| // don't wait for ticker during startup | ||
| s.fetchL1Messages() | ||
|
|
||
| select { | ||
| case <-s.ctx.Done(): | ||
| close(s.stop) | ||
| return | ||
| case <-t.C: | ||
| continue | ||
| } | ||
| } | ||
| }() |
There was a problem hiding this comment.
Redundant fetchL1Messages call on goroutine start.
fetchL1Messages() is called synchronously on line 94 before the goroutine starts, then immediately called again on line 103 at the start of the goroutine loop. The comment "don't wait for ticker during startup" on line 102 appears to be stale after the refactoring that added the synchronous initial call.
🔧 Suggested fix: wait for ticker before first in-goroutine fetch
go func() {
t := time.NewTicker(s.pollInterval)
defer t.Stop()
for {
- // don't wait for ticker during startup
- s.fetchL1Messages()
-
select {
case <-s.ctx.Done():
close(s.stop)
return
case <-t.C:
- continue
+ s.fetchL1Messages()
}
}
}()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // block node startup during initial sync and print some helpful logs | |
| s.logger.Info("initial sync start", "msg", "Running initial sync of L1 messages before starting sequencer, this might take a while...") | |
| s.fetchL1Messages() | |
| s.logger.Info("initial sync completed", "latestSyncedBlock", s.latestSynced) | |
| go func() { | |
| t := time.NewTicker(s.pollInterval) | |
| defer t.Stop() | |
| for { | |
| // don't wait for ticker during startup | |
| s.fetchL1Messages() | |
| select { | |
| case <-s.ctx.Done(): | |
| close(s.stop) | |
| return | |
| case <-t.C: | |
| continue | |
| } | |
| } | |
| }() | |
| go func() { | |
| t := time.NewTicker(s.pollInterval) | |
| defer t.Stop() | |
| for { | |
| select { | |
| case <-s.ctx.Done(): | |
| close(s.stop) | |
| return | |
| case <-t.C: | |
| s.fetchL1Messages() | |
| } | |
| } | |
| }() |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@node/sync/syncer.go` around lines 92 - 113, The code calls
s.fetchL1Messages() twice: once synchronously before the goroutine and again
immediately inside the goroutine loop; remove the redundant in-goroutine
immediate call by changing the goroutine loop to wait on the ticker (or select
on ctx.Done and t.C) before calling s.fetchL1Messages() so that the initial
synchronous fetch (the one before the goroutine) is the first fetch, leaving the
goroutine to poll only on t.C; update the comment near the ticker (t :=
time.NewTicker / "don't wait for ticker during startup") to reflect this
behavior and ensure the goroutine still closes s.stop on s.ctx.Done.
|
Actionable comments posted: 0 |
Scope
Implements the three targets of morph-l2/morph-specs SPEC-005 (tracking issue):
c82a1456): Deletenode/validator/plus all call sites / flags / docker-compose entries / monitoring panels.9f01c530): Add--derivation.verify-mode={pathA,pathB}(defaultpathA), bound at startup and not switchable at runtime. Path B does not pull blobs from beacon — it rebuilds versioned hashes locally from L2 blocks using the sequencer-consistent encoding and compares them against the values recorded on L1.a73639de): Delete the entirenode/blocktag/package; the safe / finalized heads are advanced solely by the derivation main loop + the in-process finalizer subcomponent; thefinalized ≤ safe ≤ unsafeinvariant is explicitly asserted; theSetBlockTagsengine RPC signature is unchanged.Design doc: tech-design.md
Base dependency
Base =
feat/multi_batch(PR #935). Reusescommon/batch.{ParsingTxs, BuildBlockContext, BatchData}+common/blob.{CompressBatchBytes, MakeBlobTxSidecar}. This PR will need a rebase once #935 lands onmain.Breaking change
The external semantics of
eth_getBlockByNumber("safe")change from "L1 N confirmations" → "locally derivation-verified". The driver offinalizedmoves from blocktag to the derivation-internal finalizer. Downstream consumers (wallets / indexers / bridge relayers) must be informed via changelog; the morph repository itself has 0 business call sites (only EL protocol-layer references), so external consumers will be notified separately. See tech-design §6 / §7.Test coverage
static_scan_test.go(3):TestNoValidatorReferences/TestNoBlocktagReferences/TestPathBUsesCommonBlobPackagetag_advance_test.go(7): safe / dedup / invariant ×2 / monotonic / l1-reorg-reset / block-number-errorTestVerifierDispatch_*(3) /TestPathA_*(2) /TestPathB_*(4) /TestTagAdvance_FinalizedFromL1/TestTagAdvance_PathBVsPathAParityDevnet verification
Pending (7 scenarios listed in spec §5.2, including an L1 reorg triggering a one-shot safe rewind while finalized stays put).
Rollback
Hard revert via
git revert+ redeploy (no feature-flag dual-run window; rationale in tech-design §7).Code review
@coderabbitai reviewto force a run, or wait until feat: multi blob #935 merges and this PR is rebased ontomain.BlockNumberinfinalizer.tick" have been applied).Refs
Summary by CodeRabbit
New Features
Breaking Changes
Chores